49 research outputs found

    On the learnability of Mildly Context-Sensitive languages using positive data and correction queries

    With this dissertation, we bring together the theory of Grammatical Inference and studies of language acquisition in pursuit of one final goal: to deepen our understanding of how children acquire their first language by exploiting the theory of inference of formal grammars. Our three main contributions are: 1. Introduction of a new class of languages called Simple p-dimensional external contextual (SEC). Although research in Grammatical Inference has focused on learning regular or context-free languages, we propose to focus these studies on classes of languages that are more relevant from a linguistic point of view (families of languages that occupy an orthogonal position in the Chomsky hierarchy and are Mildly Context-Sensitive, for example SEC). 2. Presentation of a new learning paradigm based on correction queries. One of the main positive results in formal learning theory is that deterministic finite automata (DFA) are efficiently learnable from membership queries and equivalence queries. Taking into account that the correction of errors can play an important role in first language acquisition, we introduce a novel learning model that replaces membership queries with correction queries. 3. Presentation of results based on the two previous contributions. First, we prove that SEC is learnable from positive data only. Second, we prove that it is possible to learn DFA from corrections and that the number of queries is reduced considerably. The results obtained in this dissertation constitute an important contribution to studies in Grammatical Inference (research in this area has so far focused mainly on the mathematical aspects of the models). Moreover, these results could be extended to studies related directly to machine translation, robotics, natural language processing, and bioinformatics.
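    The difference between a membership query and a correction query can be sketched in a few lines of Python. This is only an illustration and not the dissertation's algorithm: the abstract does not define the teacher's answer, so the sketch assumes that a correction query returns the shortest suffix (ties broken alphabetically) that turns the queried word into a member of the language. The `DFA` class, the function names, and the example automaton are all hypothetical.

```python
from collections import deque


class DFA:
    """A complete deterministic finite automaton (hypothetical helper class)."""

    def __init__(self, alphabet, transitions, start, accepting):
        self.alphabet = sorted(alphabet)      # sorted so ties break alphabetically
        self.transitions = transitions        # dict: (state, symbol) -> state
        self.start = start
        self.accepting = set(accepting)

    def run(self, word):
        """Return the state reached after reading `word` from the start state."""
        state = self.start
        for symbol in word:
            state = self.transitions[(state, symbol)]
        return state


def membership_query(dfa, word):
    """Classical membership query: a yes/no answer."""
    return dfa.run(word) in dfa.accepting


def correction_query(dfa, word):
    """Assumed form of a correction query.

    Returns the shortest suffix v such that word + v is accepted
    ("" if word is already accepted), or None if no such suffix exists.
    """
    start = dfa.run(word)
    queue = deque([(start, "")])
    seen = {start}
    # Breadth-first search visits suffixes in order of length, then alphabetically.
    while queue:
        state, suffix = queue.popleft()
        if state in dfa.accepting:
            return suffix
        for symbol in dfa.alphabet:
            nxt = dfa.transitions[(state, symbol)]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, suffix + symbol))
    return None


# Example target language: words over {a, b} that end in "ab".
ends_in_ab = DFA(
    alphabet={"a", "b"},
    transitions={
        (0, "a"): 1, (0, "b"): 0,
        (1, "a"): 1, (1, "b"): 2,
        (2, "a"): 1, (2, "b"): 0,
    },
    start=0,
    accepting={2},
)

print(membership_query(ends_in_ab, "a"))   # False: only tells the learner "no"
print(correction_query(ends_in_ab, "a"))   # "b": also tells the learner how to fix it
```

    Under this assumption, a single correction carries more information than a yes/no answer, which is one intuition for why fewer queries might be needed.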

    Learning SECp Languages from Only Positive Data

    The field of Grammatical Inference provides a good theoretical framework for investigating a learning process. Formal results in this field can be relevant to the question of first language acquisition. However, Grammatical Inference studies have been focused mainly on mathematical aspects, and have not exploited the linguistic relevance of their results. With this paper, we try to enrich Grammatical Inference studies with ideas from Linguistics. We propose a non-classical mechanism that has relevant linguistic and computational properties, and we study its learnability from positive data.

    An Introduction to Grammatical Inference for Linguists

    This paper is meant to be an introductory guide to Grammatical Inference (GI), i.e., the study of machine learning of formal languages. It is designed for non-specialists in Computer Science, but with a special interest in language learning. It covers basic concepts and models developed in the framework of GI, and tries to point out the relevance of these studies for natural language acquisition

    Experiments using semantics for learning language comprehension and production

    Several questions in natural language learning may be addressed by studying formal language learning models. In this work we hope to contribute to a deeper understanding of the role of semantics in language acquisition. We propose a simple formal model of meaning and denotation using finite state transducers, and an algorithm that learns a meaning function from examples consisting of a situation and an utterance denoting something in the situation. We describe the results of testing this algorithm in a domain of geometric shapes and their properties and relations in several natural languages: Arabic, English, Greek, Hebrew, Hindi, Mandarin, Russian, Spanish, and Turkish. In addition, we explore how a learner who has learned to comprehend utterances might go about learning to produce them, and present experimental results for this task. One concrete goal of our formal model is to be able to give an account of interactions in which an adult provides a meaning-preserving and grammatically correct expansion of a child's incomplete utterance

    An Overview of How Semantics and Corrections Can Help Language Learning

    We present an overview of the results obtained with a computational model that takes into account semantics and corrections for language learning. This model is constructed with a learner and a teacher who interact in a sequence of shared situations. The model was tested with limited sublanguages of 10 natural languages in a common domain of situations.

    Learning Simple External Contextual Languages from Positive Data

    The field of Grammatical Inference provides a good theoretical framework for investigating a learning process. Formal results in this field can be relevant to the question of first language acquisition. However, Grammatical Inference studies have been focused mainly on mathematical aspects, and have not exploited the linguistic relevance of their results. With this paper, we try to enrich Grammatical Inference studies with ideas from Linguistics. We propose a non-classical mechanism that has relevant linguistic and computational properties, and we study its learnability from positive data.

    On Language Acquisition through Womb Grammars

    We propose to automate the field of language acquisition evaluation through Constraint Solving, in particular through the use of Womb Grammars. Womb Grammar Parsing is a novel constraint-based paradigm that was devised mainly to induce grammatical structure from the description of its syntactic constraints in a related language. In this paper we argue that it is also ideal for automating the evaluation of language acquisition, and present as proof of concept a CHRG system for detecting which of fourteen levels of morphological proficiency a child is at, from a representative sample of the child's expressions. Our results also uncover ways in which the linguistic constraints that characterize a grammar need to be tailored to language acquisition applications. We also put forward a proposal for discovering in what order such levels are typically acquired in languages other than English. Our findings have great potential practical value: they can help educators tailor the games, stories, songs, etc. that can aid a child (or a second language learner) to progress in a timely fashion to the next level of proficiency, and they can also help shed light on the processes by which languages less studied than English are acquired.

    Children as Models for Computers: Natural Language Acquisition for Machine Learning

    This paper focuses on a subfield of machine learning known as grammatical inference. Roughly speaking, grammatical inference deals with the problem of inferring, by means of some inference algorithm, a grammar that generates a given set of sample sentences. We discuss how the analysis and formalization of the main features of the process of human natural language acquisition may improve results in the area of grammatical inference.

    Speeding Up Syntactic Learning Using Contextual Information

    It has been shown in (Angluin and Becerra-Bonache, 2010, 2011) that interactions between a learner and a teacher can help language learning. In this paper, we make use of additional contextual information in a pairwise-based generative approach aiming at learning (situation, sentence)-pair hidden Markov models. We show that this allows a significant speed-up of the convergence of the syntactic learning. We apply our model to a toy natural language task in Spanish dealing with geometric objects.